Benchmarking Growing Season Length #3
Merged
+174
−0
The current computation of "growing season length" in xclim uses enormous amounts of memory and usually fails with large datasets. I tested other ways to compute it; the results are good, but less spectacular than in the last two benchmarks made this way.
Two methods:
xc.run_length.first_run calls.
For the second case, I tested a lot of different versions to try to pinpoint what was responsible for the memory consumption. The best one is exp_firstruncheck.
Graphs:
Small chunks (50x50) and many years (99).
Large chunks (200x200) and fewer years (50).
The conclusion is that the default version, with small tweaks, can be sped up and made to use less memory. But the method with first_run, while slower, consumes a lot less memory and does so more stably. I have yet to test with data that has chunks smaller than a year. More to come.
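For readers unfamiliar with the approach, here is a rough NumPy-only sketch of the idea behind a first_run-based computation. It is not the code benchmarked in this PR and not xclim's implementation. The helper names, the 5 °C threshold, the 6-day window and the mid-year index are assumptions, taken from the usual definition of growing season length (first run of warm days to first run of cold days after mid-year).

```python
import numpy as np


def first_run(cond, window):
    """Index of the first run of at least `window` consecutive True values, or -1.

    Hypothetical helper for illustration; it only mimics the idea of a
    "first run" search on a 1-D boolean array.
    """
    cond = np.asarray(cond, dtype=bool)
    if window > cond.size:
        return -1
    # Sliding-window sum via a cumulative sum: a window is a full run
    # exactly when its sum equals the window length.
    csum = np.concatenate(([0], np.cumsum(cond)))
    run_sums = csum[window:] - csum[:-window]
    hits = np.flatnonzero(run_sums == window)
    return int(hits[0]) if hits.size else -1


def growing_season_length(tas, thresh=5.0, window=6, mid_index=181):
    """Growing season length (in days) for one year of daily temperatures.

    Assumed definition: days between the first run of `window` days above
    `thresh` and the first run of `window` days below `thresh` found from
    `mid_index` onward. If no end is found, the season runs to year end.
    """
    tas = np.asarray(tas, dtype=float)
    start = first_run(tas > thresh, window)
    if start < 0:
        return 0  # the season never starts
    search_from = max(mid_index, start)
    end_rel = first_run(tas[search_from:] < thresh, window)
    end = tas.size if end_rel < 0 else search_from + end_rel
    return end - start


# Synthetic year: cold until day 100, warm until day 300, cold afterwards.
year = np.concatenate((np.zeros(100), np.full(200, 10.0), np.zeros(65)))
gsl = growing_season_length(year)
```

Each year only needs two linear scans over a boolean array, which is one plausible reason a first_run-style method keeps memory use low; the actual xarray/dask behaviour is what the benchmarks above measure.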